Abstract. As software systems become more complex, there is an increasing
need for new static analyses. Thanks to its declarative style, logic
programming is an attractive formalism for specifying them. However, prior
work on using logic programming for static analysis focused on analyses
defined over some powerset domain, which is quite limiting. In this paper we
present a logic that lifts this restriction, called Lattice based Least Fixed
Point Logic (LLFP), that allows interpretations over any complete lattice
satisfying the Ascending Chain Condition. The main theoretical contribution is
a Moore Family result that guarantees that there always is a unique least
solution for a given problem. Another contribution is the development of a
solving algorithm that computes the least model of LLFP formulae guaranteed by
the Moore Family result.
1 Introduction

Nowadays, we heavily rely on software systems. At the same time they become 
bigger and more complex, and hence the number of potential errors increases. 
In order to achieve more reliable systems, formal verification techniques may 
be applied. A widely used verification technique is static analysis, which rea- 
sons about system behavior without executing it. It is performed statically at 
compile-time, and it computes safe approximations of values or behaviors that 
may occur at run-time. Static analysis is increasingly recognized as a fundamen- 
tal technique for program verification, bug detection, compiler optimization and 
software understanding. 

Unfortunately, developing new static analyses is difficult and error-prone. 
In order to overcome that problem it is desirable to implement prototypes of 
analyses that are easy to analyse for complexity and correctness. Since analysis 
specifications are generally written in a declarative style, logic programming 
presents an attractive model for producing executable specifications of analyses. 
Furthermore, thanks to the advances in logic programming, the associated solvers
have become more efficient.

In this paper we present a framework that facilitates rapid prototyping of new
static analyses. The approach taken falls within the Abstract Interpretation [7,8]
framework, thus there always is a unique best solution to the analysis problem
considered. The framework consists of the Lattice based Least Fixed Point Logic
(LLFP) and the associated solver. The most prominent feature of the LLFP
logic is its interpretation over complete lattices satisfying the Ascending Chain
Condition, which makes it possible to express sophisticated analyses in LLFP.
The solver combines a continuation passing style algorithm with propagation of
differences, and uses prefix trees as its main data structure. The applicability
of the framework is illustrated by presenting a specification of interval analysis,
which could not be specified using logics traditionally used such as Datalog [1,5]
or ALFP [18].

This paper is organized as follows. In Section 2 we present the problem we
want to solve and indicate a solution. Section 3 introduces syntax and semantics
of the LLFP logic. In Section 4 we establish our main theoretical contribution,
namely a Moore Family result for LLFP. Section 5 describes the solving
algorithm for LLFP. We conclude and discuss future work in Section 6.

2 The problem 

There is an immense body of work on using logic for specifying static analyses.
However, logics traditionally used have some limitations. To illustrate the
problem, let us briefly introduce the Alternation-free Least Fixed Point Logic
(ALFP) and then try to devise the ALFP specifications of two analyses:
detection of signs and interval analysis.

Alternation-free Least Fixed Point Logic. Many static analyses can be succinctly
expressed using Alternation-free Least Fixed Point Logic (ALFP) [18]. The logic
is a generalization of Datalog [1,5] and it has proved to have a number of
properties essential for specifying static analyses such as the existence of a
unique least model. The syntax of ALFP is given by

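$$\begin{array}{rcl}
v & ::= & x \mid a \\
pre & ::= & R(v_1, \ldots, v_k) \mid \neg R(v_1, \ldots, v_k) \mid pre_1 \wedge pre_2 \\
 & \mid & pre_1 \vee pre_2 \mid \exists x : pre \mid v_1 = v_2 \mid v_1 \neq v_2 \\
cl & ::= & R(v_1, \ldots, v_k) \mid \mathbf{1} \mid cl_1 \wedge cl_2 \mid pre \Rightarrow cl \mid \forall x : cl
\end{array}$$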

where we write a for constants, x for analysis variables, v for values, R for
predicates, pre for preconditions, and cl for clauses. The clauses are interpreted
over a universe $\mathcal{U}$ of constants, $a \in \mathcal{U}$. The interpretation is given in terms of
satisfaction relations $(\rho, \sigma) \models pre$ and $(\rho, \sigma) \models cl$ where $\rho$ is an interpretation of
predicates, and $\sigma$ is an interpretation of variables. The definition is standard and
hence omitted. Due to the use of negation, we impose a stratification condition
similar to the one in Datalog [1,5]. This intuitively means that no predicate
depends on the negation of itself. We refer to [18] for more details.

Example 1. Using the notion of stratification we can define equality E and
non-equality N predicates as follows

$$(\forall x : E(x, x)) \wedge (\forall x : \forall y : \neg E(x, y) \Rightarrow N(x, y))$$

The formula is stratified, since predicate E is fully asserted before it is negatively
queried in the clause asserting predicate N.

Detection of signs analysis. Now, let us consider the ALFP formulation of the
detection of signs analysis. The analysis aims to determine for each program
point and each variable, the possible sign (negative, zero or positive) that the
variable may have whenever the execution reaches that point. In the following,
we use program graphs as representation of the program under consideration
[2]. Compared to the classical flow graphs [14,16], the main difference is that in
the program graphs the actions label the edges rather than the states. Here we
focus on three types of actions: assignments, boolean expressions and the skip
action. For simplicity we assume that assignments are in three-address form. The
analysis is defined by predicate A, and we begin with initializing the initial state
of the program graph, $q_0$, with all possible signs for all variables v occurring in
the underlying program graph

$$\bigwedge_{v \in Var} A(q_0, v, -) \wedge A(q_0, v, 0) \wedge A(q_0, v, +)$$

Intuitively it indicates that at state $q_0$ all variables may have all possible values.
Now we consider the ALFP specifications for each type of action. Whenever we
have $q_s \xrightarrow{x := y \star z} q_t$ in the program graph we generate

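$$\begin{array}{l}
(\forall s : \forall s_y : \forall s_z : A(q_s, y, s_y) \wedge A(q_s, z, s_z) \wedge R_\star(s_y, s_z, s) \Rightarrow A(q_t, x, s)) \;\wedge \\
(\forall v : \forall s : v \neq x \wedge A(q_s, v, s) \Rightarrow A(q_t, v, s))
\end{array}$$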

where we assume that we have a relation for each type of arithmetic operation,
denoted by $R_\star$ in the above formula. The first conjunct states that for all possible
values $s$, $s_y$ and $s_z$, if at state $q_s$ the signs of variables y and z are $s_y$ and $s_z$,
respectively, and the sign of the result of evaluating the arithmetic operation $\star$
is $s$, then at state $q_t$ variable x will have sign $s$. The second conjunct expresses
that for all variables v and signs s, if the variable is different from x and at state
$q_s$ it has sign s, then it will have the same sign at state $q_t$. Similarly, whenever
we have $q_s \xrightarrow{b} q_t$ or $q_s \xrightarrow{skip} q_t$ in the program graph, where b is a
boolean expression, we generate a clause

$$\forall v : \forall s : A(q_s, v, s) \Rightarrow A(q_t, v, s)$$

The clause simply propagates the signs of all variables along the edge of the 
program graph, without altering it. 
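
As a rough illustration of the assignment clause above, the following Python
sketch applies its two conjuncts to a single edge. The names `R_MUL`,
`sign_of_product` and `transfer_assign`, as well as the input facts, are
hypothetical and chosen only for this example; the relation $R_\star$ is
instantiated for multiplication.

```python
from itertools import product

SIGNS = ("-", "0", "+")

def sign_of_product(sy, sz):
    """Sign of y * z given the signs of y and z."""
    if sy == "0" or sz == "0":
        return "0"
    return "+" if sy == sz else "-"

# R_MUL plays the role of the relation R_* for multiplication: it contains
# (s_y, s_z, s) whenever operands of signs s_y and s_z yield a result of sign s.
R_MUL = {(sy, sz, sign_of_product(sy, sz)) for sy, sz in product(SIGNS, SIGNS)}

def transfer_assign(A, qs, qt, x, y, z, rel):
    """Apply the two conjuncts generated for an edge qs --x := y*z--> qt.

    A is a set of facts (state, variable, sign), i.e. the predicate A.
    """
    new = set(A)
    for (sy, sz, s) in rel:
        if (qs, y, sy) in A and (qs, z, sz) in A:
            new.add((qt, x, s))      # first conjunct: sign of the result
    for (q, v, s) in A:
        if q == qs and v != x:
            new.add((qt, v, s))      # second conjunct: other variables
    return new

# Hypothetical input: at q0, y is negative and z is negative or zero.
A0 = {("q0", "y", "-"), ("q0", "z", "-"), ("q0", "z", "0")}
A1 = transfer_assign(A0, "q0", "q1", "x", "y", "z", R_MUL)
signs_x = {s for (q, v, s) in A1 if q == "q1" and v == "x"}
```

Here the analysis information is a plain set of tuples, which is exactly the
powerset setting that Datalog and ALFP support.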

In a similar manner we could formulate other analyses such as pointer analysis
[21,15,20,4], or classical data flow analyses [13,19]. In more general terms, logics
traditionally used, e.g. Datalog and ALFP, can be used for specifying analyses
defined over a powerset domain. However, as we show in the next paragraph,
many interesting analyses are defined over some mathematical structure such as
a complete lattice. Thus, let us now consider interval analysis as an example of
such an analysis.

Interval analysis. The purpose of interval analysis is to determine for each
program point an interval containing possible values of variables whenever that
point is reached during run-time execution. The analysis results can be used
for Array Bound Analysis, which determines whether an array index is always
within the bounds of the array. If this is the case, a run-time check can safely
be eliminated, which makes code more efficient.

We begin with defining the complete lattice $(Interval, \sqsubseteq_I)$ over which the
analysis is defined. The underlying set is

$$Interval = \{\bot\} \cup \{[z_1, z_2] \mid z_1 \le z_2,\ z_1 \in Z \cup \{-\infty\},\ z_2 \in Z \cup \{\infty\}\}$$

where Z is a finite subset of integers, $Z \subseteq \mathbb{Z}$, and the integer ordering $\le$ on
Z is extended to an ordering on $Z' = Z \cup \{-\infty, \infty\}$ by taking for all $z \in Z$:
$-\infty \le z$, $z \le \infty$ and $-\infty \le \infty$. In the above definition, $\bot$ denotes an empty
interval, whereas $[z_1, z_2]$ is the interval from $z_1$ to $z_2$ including the end points,
where $z_1, z_2 \in Z$. The interval $[-\infty, \infty]$ is equivalent to the top element, $\top$. In
the following we use i to denote an interval from Interval. The partial ordering
$\sqsubseteq_I$ in Interval uses operations inf and sup

$$\inf(i) = \begin{cases} \infty & \text{if } i = \bot \\ z_1 & \text{if } i = [z_1, z_2] \end{cases}
\qquad
\sup(i) = \begin{cases} -\infty & \text{if } i = \bot \\ z_2 & \text{if } i = [z_1, z_2] \end{cases}$$

and is defined as

$$i_1 \sqsubseteq_I i_2 \quad\text{iff}\quad \inf(i_2) \le \inf(i_1) \wedge \sup(i_1) \le \sup(i_2)$$

The intuition behind the partial ordering in Interval is that

$$i_1 \sqsubseteq_I i_2 \Leftrightarrow \{z \mid z \text{ belongs to } i_1\} \subseteq \{z \mid z \text{ belongs to } i_2\}$$

Unfortunately, due to limited expressiveness of the ALFP logic (and similarly
Datalog), interval analysis cannot be specified using these formalisms. Hence, in
the following section we present a solution for that problem; namely we introduce
LLFP logic.

3 Syntax and Semantics

In the previous section we briefly introduced ALFP logic, which is interpreted
over a finite universe of atoms; in this section we present an extension of ALFP
called LLFP allowing interpretations over complete lattices satisfying the
Ascending Chain Condition. We also allow function terms as arguments of relations.
Since functions over the universe $\mathcal{U}$ can be represented as relations, we do not
consider them here. Instead, we focus on functions over a complete lattice
$[\![f]\!] : \mathcal{L}^k \to \mathcal{L}$, and we restrict our attention to monotone functions only.
Recall that a function $[\![f]\!] : \mathcal{L}_1 \to \mathcal{L}_2$ between partially ordered sets
$\mathcal{L}_1 = (L_1, \sqsubseteq_1)$ and $\mathcal{L}_2 = (L_2, \sqsubseteq_2)$ is monotone if

$$\forall l, l' \in L_1 : l \sqsubseteq_1 l' \Rightarrow [\![f]\!](l) \sqsubseteq_2 [\![f]\!](l')$$
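
As an illustration, the interval ordering $\sqsubseteq_I$ and the monotonicity
requirement on function symbols can be sketched in Python. The encoding
(`None` for $\bot$, pairs for proper intervals, `math.inf` for the infinite
bounds) and the helper names are assumptions made for this example only, and
monotonicity is merely tested on a small sample rather than proved.

```python
import math
from itertools import product

BOT = None  # the empty interval; proper intervals are pairs (z1, z2)

def inf(i):
    return math.inf if i is BOT else i[0]

def sup(i):
    return -math.inf if i is BOT else i[1]

def leq(i1, i2):
    """i1 is below i2 iff inf(i2) <= inf(i1) and sup(i1) <= sup(i2)."""
    return inf(i2) <= inf(i1) and sup(i1) <= sup(i2)

def join(i1, i2):
    """Least upper bound of two intervals."""
    if i1 is BOT:
        return i2
    if i2 is BOT:
        return i1
    return (min(i1[0], i2[0]), max(i1[1], i2[1]))

def add(i1, i2):
    """Interval addition, a candidate interpretation of a function symbol."""
    if i1 is BOT or i2 is BOT:
        return BOT
    return (i1[0] + i2[0], i1[1] + i2[1])

SAMPLE = [BOT, (0, 0), (0, 2), (1, 1), (-1, 3),
          (-math.inf, 0), (-math.inf, math.inf)]

def is_monotone(f, sample):
    """Check a <= b and c <= d implies f(a, c) <= f(b, d) on the sample."""
    pairs = list(product(sample, repeat=2))
    return all(leq(f(a, c), f(b, d))
               for (a, b) in pairs if leq(a, b)
               for (c, d) in pairs if leq(c, d))
```

For instance, `leq((0, 2), (-1, 3))` holds because every integer in [0, 2]
also belongs to [-1, 3], in line with the set-inclusion intuition.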
Definition 1. A complete lattice $\mathcal{L} = (L, \sqsubseteq) = (L, \sqsubseteq, \bigsqcup, \bigsqcap, \bot, \top)$ is a partially
ordered set $(L, \sqsubseteq)$ such that all subsets have least upper bounds as well as greatest
lower bounds.

A subset $Y \subseteq L$ of a partially ordered set $\mathcal{L} = (L, \sqsubseteq)$ is a chain if

$$\forall l_1, l_2 \in Y : (l_1 \sqsubseteq l_2) \vee (l_2 \sqsubseteq l_1)$$

Hence, a chain is a (possibly empty) subset of $L$ that is totally ordered. A
sequence $(l_n)_n$ of elements in $L$ is an ascending chain if

$$n \le m \Rightarrow l_n \sqsubseteq l_m$$

We say that a sequence $(l_n)_n$ eventually stabilises if and only if

$$\exists n_0 \in \mathbb{N} : \forall n \in \mathbb{N} : n \ge n_0 \Rightarrow l_n = l_{n_0}$$

The partially ordered set $\mathcal{L}$ satisfies the Ascending Chain Condition if and only if all
ascending chains eventually stabilise. Essentially, the Ascending Chain Condition
guarantees that the least fixed point computation always terminates. Due to
the use of negation in the logic, we need to introduce a complement operator,
$\complement$, in the underlying complete lattice. The only condition that we impose on
the complement is anti-monotonicity, i.e.
$\forall l_1, l_2 \in L : l_1 \sqsubseteq l_2 \Rightarrow \complement l_1 \sqsupseteq \complement l_2$,
which is necessary for establishing the Moore Family result. The following definition
introduces the syntax of LLFP.

Definition 2. Given fixed countable and pairwise disjoint sets $\mathcal{X}$ and $\mathcal{Y}$ of
variables, a non-empty and finite universe $\mathcal{U}$, a complete lattice $\mathcal{L}$ satisfying the
Ascending Chain Condition, finite alphabets $\mathcal{R}$ and $\mathcal{F}$ of predicate and function
symbols, respectively, we define the set of LLFP formulae (or clause sequences), cls,
together with clauses, cl, preconditions, pre, terms u and lattice terms V and V'
by the grammar:

$$\begin{array}{rcl}
u & ::= & x \mid a \\
V & ::= & Y \mid [u] \\
V' & ::= & V \mid f(\vec{V'}) \\
pre & ::= & R(\vec{u}; V) \mid \neg R(\vec{u}; V) \mid Y(u) \mid pre_1 \wedge pre_2 \mid pre_1 \vee pre_2 \\
 & \mid & \exists x : pre \mid \exists Y : pre \\
cl & ::= & R(\vec{u}; V') \mid \mathbf{1} \mid cl_1 \wedge cl_2 \mid pre \Rightarrow cl \mid \forall x : cl \mid \forall Y : cl \\
cls & ::= & cl_1, \ldots, cl_s
\end{array}$$

Here $x \in \mathcal{X}$, $a \in \mathcal{U}$, $Y \in \mathcal{Y}$, $R \in \mathcal{R}$, $f \in \mathcal{F}$, and $s \ge 1$. Furthermore, $\vec{u}$ and $\vec{V'}$
abbreviate tuples $(u_1, \ldots, u_k)$ and $(V'_1, \ldots, V'_k)$ for some $k \ge 0$, respectively.

We write $fv(\cdot)$ for the set of free variables in the argument $\cdot$. Occurrences of
$R(\vec{u}; V)$ and $\neg R(\vec{u}; V)$ in preconditions are called positive, resp. negative, queries
and we require that $fv(\vec{u}) \subseteq \mathcal{X}$ and $fv(V) \subseteq \mathcal{Y} \cup \mathcal{X}$; these variables are
defining occurrences. Occurrences of $Y(u)$ in preconditions must satisfy $Y \in \mathcal{Y}$ and
$fv(u) \subseteq \mathcal{X}$; Y is an applied occurrence, u is a defining occurrence. Clauses
of the form $R(\vec{u}; V')$ are called assertions; we require that $fv(\vec{u}) \subseteq \mathcal{X}$ and
$fv(V') \subseteq \mathcal{Y} \cup \mathcal{X}$ and we note that these variables are applied occurrences. A
clause cl satisfying these conditions together with $fv(cl) = \emptyset$ is said to be
well-formed; we are only interested in clause sequences cls consisting of well-formed
clauses.

In order to ensure desirable theoretical and pragmatic properties in the
presence of negation, we impose a notion of stratification similar to the one in
Datalog [1,5]. Intuitively, stratification ensures that a negative query is not performed
until the predicate has been fully asserted. This is important for ensuring that
once a precondition evaluates to true it will continue to be true even after further
assertions of predicates.

Definition 3. The formula $cls = cl_1, \ldots, cl_s$ is stratified if there exists a
function $rank : \mathcal{R} \to \{0, \ldots, s\}$ such that for all $i = 1, \ldots, s$:

— $rank(R) = i$ for every assertion R in $cl_i$;

— $rank(R) \le i$ for every positive query R in $cl_i$; and

— $rank(R) < i$ for every negative query $\neg R$ in $cl_i$.

The following example illustrates the use of negation in an LLFP formula.

Example 2. Similarly to Example 1 we can define equality E and non-equality
N predicates in LLFP as follows

$$(\forall x : E(x; [x])),\ (\forall x : \forall Y : \neg E(x; Y) \Rightarrow N(x; Y))$$

According to Definition 3 the formula is stratified, since predicate E is fully
asserted before it is negatively queried in the clause asserting predicate N. As a
result we can dispense with an explicit treatment of $=$ and $\neq$ in the development
that follows. On the other hand, Definition 3 rules out

$$(\forall x : \forall Y : \neg P(x; Y) \Rightarrow Q(x; Y)),\ (\forall x : \forall Y : \neg Q(x; Y) \Rightarrow P(x; Y))$$
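
The stratification condition of Definition 3 can be checked mechanically. The
sketch below is one possible reading, under the assumption that each clause is
summarised by the predicates it asserts, queries positively and queries
negatively; this dictionary encoding and the function names are hypothetical.
Predicates that are never asserted are given rank 0.

```python
def rank_of(clauses):
    """Rank of each asserted predicate: the index of the clause asserting it.

    Returns None if a predicate is asserted in two different clauses, since
    then no rank function can satisfy rank(R) = i for every assertion.
    """
    rank = {}
    for i, cl in enumerate(clauses, start=1):
        for r in cl["asserts"]:
            if rank.setdefault(r, i) != i:
                return None
    return rank

def is_stratified(clauses):
    rank = rank_of(clauses)
    if rank is None:
        return False
    for i, cl in enumerate(clauses, start=1):
        # positive queries need rank(R) <= i
        if any(rank.get(r, 0) > i for r in cl["pos"]):
            return False
        # negative queries need rank(R) < i
        if any(rank.get(r, 0) >= i for r in cl["neg"]):
            return False
    return True

# Example 2: E is asserted in clause 1 and negatively queried in clause 2.
example2 = [
    {"asserts": {"E"}, "pos": set(), "neg": set()},
    {"asserts": {"N"}, "pos": set(), "neg": {"E"}},
]

# The formula ruled out by Definition 3: P and Q depend negatively on each other.
ruled_out = [
    {"asserts": {"Q"}, "pos": set(), "neg": {"P"}},
    {"asserts": {"P"}, "pos": set(), "neg": {"Q"}},
]
```

With this encoding `is_stratified(example2)` holds, while
`is_stratified(ruled_out)` does not, because clause 1 negatively queries P,
which is only asserted in clause 2.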

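To specify the semantics of LLFP we introduce the interpretations $\varrho$, $\varsigma$ and $\zeta$
of predicate symbols, variables and function symbols, respectively. Formally we
have

$$\begin{array}{rcl}
\varrho & : & \prod_k \mathcal{R}_{/k} \to \mathcal{U}^k \to \mathcal{L} \\
\varsigma & : & (\mathcal{X} \to \mathcal{U}) \times (\mathcal{Y} \to \mathcal{L}_{\neq\bot}) \\
\zeta & : & \prod_k \mathcal{F}_{/k} \to \mathcal{L}^k \to \mathcal{L}
\end{array}$$

In the above $\mathcal{R}_{/k}$ stands for a set of predicate symbols of arity k, and $\mathcal{R}$ is a
disjoint union of $\mathcal{R}_{/k}$, hence $\mathcal{R} = \biguplus_k \mathcal{R}_{/k}$. Similarly, $\mathcal{F}_{/k}$ is a set of function
symbols of arity k over the complete lattice $\mathcal{L}$. The set $\mathcal{F}$ is then defined as
a disjoint union of $\mathcal{F}_{/k}$; hence $\mathcal{F} = \biguplus_k \mathcal{F}_{/k}$. The interpretation of variables from
$\mathcal{X}$ is given by $[\![x]\!](\zeta, \varsigma) = \varsigma(x)$, where $\varsigma(x)$ is the element from $\mathcal{U}$ bound to $x \in \mathcal{X}$.
Analogously, the interpretation of variables from $\mathcal{Y}$ is given by $[\![Y]\!](\zeta, \varsigma) = \varsigma(Y)$,
where $\varsigma(Y)$ is the element from $\mathcal{L}_{\neq\bot} = \mathcal{L} \setminus \{\bot\}$ bound to $Y \in \mathcal{Y}$. We do not
allow variables from $\mathcal{Y}$ to be mapped to $\bot$ in order to establish a relationship
between ALFP and LLFP in the case of a powerset lattice, i.e. $\mathcal{P}(\mathcal{U})$, which we
briefly describe later. In order to give the interpretation of $[u]$, we introduce a
function $\beta : \mathcal{U} \to \mathcal{L}$. The $\beta$ function is called a representation function and the
idea is that $\beta$ maps a value from the universe $\mathcal{U}$ to the best property describing it.
For example in the case of a powerset lattice, $\beta$ could be defined by $\beta(a) = \{a\}$
for all $a \in \mathcal{U}$. Then the interpretation is given by $[\![[u]]\!](\zeta, \varsigma) = \beta([\![u]\!](\zeta, \varsigma))$. The
interpretation of function terms is defined as $[\![f(\vec{V'})]\!](\zeta, \varsigma) = \zeta(f)([\![\vec{V'}]\!](\zeta, \varsigma))$.
For the functions we require that $\zeta(f) : \mathcal{L}^k \to \mathcal{L}$ is monotone. The interpretation
of terms is generalized to sequences $\vec{u}$ of terms in a point-wise manner by taking
$[\![a]\!](\zeta, \varsigma) = a$ for all $a \in \mathcal{U}$, thus
$[\![(u_1, \ldots, u_k)]\!](\zeta, \varsigma) = ([\![u_1]\!](\zeta, \varsigma), \ldots, [\![u_k]\!](\zeta, \varsigma))$.

The interpretation of lattice terms V (and V') is generalized to sequences $\vec{V}$
(and $\vec{V'}$) of lattice terms in a similar way.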

The satisfaction relations for preconditions pre, clauses cl and clause
sequences cls are specified by:

$$(\varrho, \varsigma) \models_\beta pre, \quad (\varrho, \zeta, \varsigma) \models_\beta cl \quad\text{and}\quad (\varrho, \zeta, \varsigma) \models_\beta cls$$

The formal definition is given in Table 1; here $\varsigma[x \mapsto a]$ stands for the mapping
that is as $\varsigma$ except that x is mapped to a and similarly $\varsigma[Y \mapsto l]$ stands for the
mapping that is as $\varsigma$ except that Y is mapped to $l \in \mathcal{L}_{\neq\bot}$.



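Table 1. Semantics of LLFP

$$\begin{array}{lcl}
(\varrho, \varsigma) \models_\beta R(\vec{u}; V) & \text{iff} & \varrho(R)(\varsigma(\vec{u})) \sqsupseteq \varsigma(V) \\
(\varrho, \varsigma) \models_\beta \neg R(\vec{u}; V) & \text{iff} & \complement(\varrho(R)(\varsigma(\vec{u}))) \sqsupseteq \varsigma(V) \\
(\varrho, \varsigma) \models_\beta pre_1 \wedge pre_2 & \text{iff} & (\varrho, \varsigma) \models_\beta pre_1 \text{ and } (\varrho, \varsigma) \models_\beta pre_2 \\
(\varrho, \varsigma) \models_\beta pre_1 \vee pre_2 & \text{iff} & (\varrho, \varsigma) \models_\beta pre_1 \text{ or } (\varrho, \varsigma) \models_\beta pre_2 \\
(\varrho, \varsigma) \models_\beta \exists x : pre & \text{iff} & (\varrho, \varsigma[x \mapsto a]) \models_\beta pre \text{ for some } a \in \mathcal{U} \\
(\varrho, \varsigma) \models_\beta \exists Y : pre & \text{iff} & (\varrho, \varsigma[Y \mapsto l]) \models_\beta pre \text{ for some } l \in \mathcal{L}_{\neq\bot} \\[1ex]
(\varrho, \zeta, \varsigma) \models_\beta R(\vec{u}; V') & \text{iff} & \varrho(R)([\![\vec{u}]\!](\zeta, \varsigma)) \sqsupseteq [\![V']\!](\zeta, \varsigma) \\
(\varrho, \zeta, \varsigma) \models_\beta \mathbf{1} & \text{iff} & true \\
(\varrho, \zeta, \varsigma) \models_\beta cl_1 \wedge cl_2 & \text{iff} & (\varrho, \zeta, \varsigma) \models_\beta cl_1 \text{ and } (\varrho, \zeta, \varsigma) \models_\beta cl_2 \\
(\varrho, \zeta, \varsigma) \models_\beta pre \Rightarrow cl & \text{iff} & (\varrho, \zeta, \varsigma) \models_\beta cl \text{ whenever } (\varrho, \varsigma) \models_\beta pre \\
(\varrho, \zeta, \varsigma) \models_\beta \forall x : cl & \text{iff} & (\varrho, \zeta, \varsigma[x \mapsto a]) \models_\beta cl \text{ for all } a \in \mathcal{U} \\
(\varrho, \zeta, \varsigma) \models_\beta \forall Y : cl & \text{iff} & (\varrho, \zeta, \varsigma[Y \mapsto l]) \models_\beta cl \text{ for all } l \in \mathcal{L}_{\neq\bot} \\[1ex]
(\varrho, \zeta, \varsigma) \models_\beta cl_1, \ldots, cl_s & \text{iff} & (\varrho, \zeta, \varsigma) \models_\beta cl_i \text{ for all } i,\ 1 \le i \le s
\end{array}$$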



Relationship to ALFP. As the reader may have already noticed, in the case where the
underlying complete lattice is $\mathcal{P}(\mathcal{U})$ the two logics are essentially equivalent.
More precisely, in the case of the powerset lattice $\mathcal{P}(\mathcal{U})$, with the function $\beta$ given by
$\beta(a) = \{a\}$ for all $a \in \mathcal{U}$, and without function terms, we can translate an LLFP
formula into a corresponding ALFP one and vice versa. Intuitively, we get the following
correspondence between interpretations of relations

$$\forall \vec{a}, b : ((\vec{a}, b) \in \rho(R) \Leftrightarrow \varrho(R)(\vec{a}) \supseteq \{b\})$$

The idea is that a relation R in LLFP with interpretation $\varrho(R) \in \mathcal{U}^k \to \mathcal{P}(\mathcal{U})$
is replaced by a relation in ALFP (also named R) with interpretation
$\rho(R) \in \mathcal{P}(\mathcal{U}^{k+1})$. Note that if $\varrho(R)(\vec{a}) = \bot$ then $\rho(R)$ does not contain any tuples with
$\vec{a}$ as the first k components.
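
The correspondence can be made concrete with a small Python sketch for the
powerset case. The function names `to_llfp` and `to_alfp` and the sample
relation are hypothetical; argument tuples mapped to $\bot$ (the empty set)
are simply absent from the dictionary.

```python
from collections import defaultdict

def to_llfp(rho_R, k):
    """Turn an ALFP relation (a set of (k+1)-tuples) into a map from
    k-tuples to subsets of the universe."""
    varrho_R = defaultdict(frozenset)
    for t in rho_R:
        args, b = t[:k], t[k]
        varrho_R[args] = varrho_R[args] | {b}
    return dict(varrho_R)

def to_alfp(varrho_R):
    """Inverse direction: one (k+1)-tuple per element of each subset."""
    return {args + (b,) for args, bs in varrho_R.items() for b in bs}

# Hypothetical detection-of-signs facts A(state, variable, sign).
rho_A = {("q0", "x", "+"), ("q0", "x", "0"), ("q1", "y", "-")}
varrho_A = to_llfp(rho_A, 2)
```

Going back with `to_alfp(varrho_A)` yields the original set of tuples, which
is the sense in which the two logics coincide on powerset lattices.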

Interval analysis in LLFP. Now let us give an LLFP specification of interval
analysis. The analysis is defined by the predicate A. Similarly to Datalog or
ALFP, the specification is defined over a universe $\mathcal{U}$, which in this case is a set
of all variables, Var, appearing in the program as well as states in the
underlying program graph. In addition, the LLFP logic allows interpretations over
complete lattices satisfying the Ascending Chain Condition. Here we use the lattice
$(Interval, \sqsubseteq_I)$, defined in Section 2.

The specification consists of the initialization clauses and clauses
corresponding to three types of actions in the underlying program graph. First, for the initial
state, $q_0$, we initialize all variables in the program graph with the $\top$ element,
denoting that they may have all possible values

$$\bigwedge_{v \in Var} A(q_0, v; \top)$$

Furthermore, whenever we have $q_s \xrightarrow{x := y \star z} q_t$ in the program graph we generate

$$\begin{array}{l}
(\forall i_y : \forall i_z : A(q_s, y; i_y) \wedge A(q_s, z; i_z) \Rightarrow A(q_t, x; f_\star(i_y, i_z))) \;\wedge \\
(\forall v : \forall i : v \neq x \wedge A(q_s, v; i) \Rightarrow A(q_t, v; i))
\end{array}$$

The first conjunct updates the possible interval of values for the assigned variable
(in that case for variable x), with the result of evaluating the arithmetic operation
$y \star z$. The second conjunct propagates the analysis information for all variables
except variable x without altering it. Furthermore, whenever we have $q_s \xrightarrow{b} q_t$
or $q_s \xrightarrow{skip} q_t$ in the program graph, where b is a boolean expression, we generate a clause

$$\forall v : \forall i : A(q_s, v; i) \Rightarrow A(q_t, v; i)$$

which simply propagates the analysis information along the edge of the program 
graph, without making any changes. 
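
To give a feel for what these clauses compute, the following Python sketch
evaluates them by naive iteration on a tiny, hypothetical program graph with
one assignment edge and one skip edge. It is not the solver of Section 5. The
encoding of intervals, the choice of addition for $f_\star$, and the initial
intervals for y and z (used instead of $\top$ so that the result is
informative) are all assumptions of the example. The graph is acyclic; graphs
with loops rely on the finiteness of Z for termination.

```python
import math

BOT = None                          # empty interval
TOP = (-math.inf, math.inf)

def join(i1, i2):
    if i1 is BOT:
        return i2
    if i2 is BOT:
        return i1
    return (min(i1[0], i2[0]), max(i1[1], i2[1]))

def add(i1, i2):
    """Interpretation of f_* for the operation +."""
    if i1 is BOT or i2 is BOT:
        return BOT
    return (i1[0] + i2[0], i1[1] + i2[1])

def analyse(variables, edges, init):
    A = dict(init)                  # (state, variable) -> interval, missing = BOT

    def assert_fact(q, v, i):
        """Join i into A(q, v); report whether anything changed."""
        old = A.get((q, v), BOT)
        new = join(old, i)
        A[(q, v)] = new
        return new != old

    changed = True
    while changed:
        changed = False
        for qs, action, qt in edges:
            carried = variables
            if action[0] == "assign":
                _, x, y, z = action
                result = add(A.get((qs, y), BOT), A.get((qs, z), BOT))
                changed |= assert_fact(qt, x, result)       # first conjunct
                carried = [v for v in variables if v != x]
            for v in carried:                               # propagation
                changed |= assert_fact(qt, v, A.get((qs, v), BOT))
    return A

edges = [("q0", ("assign", "x", "y", "z"), "q1"), ("q1", ("skip",), "q2")]
init = {("q0", "x"): TOP, ("q0", "y"): (1, 2), ("q0", "z"): (3, 4)}
A = analyse(["x", "y", "z"], edges, init)
```

Because $f_\star$ is monotone, applying it to the largest intervals known for
y and z covers all smaller intervals quantified over in the clause.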

4 Moore family result for LLFP 

In this section we establish a Moore family result for LLFP that guarantees that
there always is a unique best solution for LLFP clauses.

Definition 4. A Moore family is a subset $Y$ of a complete lattice $\mathcal{L} = (L, \sqsubseteq)$
that is closed under greatest lower bounds: $\forall Y' \subseteq Y : \bigsqcap Y' \in Y$.

It follows that a Moore family always contains a least element, $\bigsqcap Y$, and a
greatest element, $\bigsqcap \emptyset$, which equals the greatest element, $\top$, from $L$; in particular,
a Moore family is never empty. The property is also called the model intersection
property, since whenever we take a meet of a number of models we still get a
model.
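
For a finite powerset lattice the definition can be checked by brute force.
The sketch below does so in Python; the helper names and the two sample
families are hypothetical, and the meet is set intersection with the meet of
the empty collection taken to be the top element.

```python
from itertools import chain, combinations

def meet(sets, top):
    """Greatest lower bound in a powerset lattice: intersection, with the
    meet of the empty collection equal to the top element."""
    result = frozenset(top)
    for s in sets:
        result &= s
    return result

def subsets(items):
    items = list(items)
    return chain.from_iterable(
        combinations(items, n) for n in range(len(items) + 1))

def is_moore_family(Y, top):
    """Check that the meet of every subset of Y is again in Y."""
    return all(meet(sub, top) in Y for sub in subsets(Y))

TOP = frozenset({1, 2, 3})
family = {frozenset({1}), frozenset({1, 2}), frozenset({1, 3}), TOP}
not_a_family = {frozenset({1, 2}), frozenset({1, 3}), TOP}
least = meet(family, TOP)
```

The second collection fails the check because the intersection of {1, 2} and
{1, 3} is missing, so it has no least element of its own.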

Assume cls has the form $cl_1, \ldots, cl_s$, and let $\Delta = \{\varrho : \prod_k \mathcal{R}_{/k} \to \mathcal{U}^k \to \mathcal{L}\}$
denote the set of interpretations $\varrho$ of predicate symbols in $\mathcal{R}$. We also define
the lexicographical ordering $\preceq$ such that $\varrho_1 \preceq \varrho_2$ if and only if there is some
$1 \le j \le s$, where s is the order of the formula, such that the following properties
hold:

(a) $\varrho_1(R) = \varrho_2(R)$ for all $R \in \mathcal{R}$ with $rank(R) < j$,

(b) $\varrho_1(R) \sqsubseteq \varrho_2(R)$ for all $R \in \mathcal{R}$ with $rank(R) = j$,

(c) either $j = s$ or $\varrho_1(R) \sqsubset \varrho_2(R)$ for at least one $R \in \mathcal{R}$ with $rank(R) = j$.

We say that $\varrho_1(R) \sqsubseteq \varrho_2(R)$ if and only if $\forall \vec{a} \in \mathcal{U}^k : \varrho_1(R)(\vec{a}) \sqsubseteq \varrho_2(R)(\vec{a})$, where
$k \ge 0$ is the arity of R. Notice that in the case $s = 1$, the above ordering coincides
with the lattice ordering $\sqsubseteq$. Intuitively, the lexicographical ordering $\preceq$ orders the
relations stratum by stratum starting with stratum 0. It is essentially analogous to
the lexicographical ordering on strings, which is based on the alphabetical order
of their characters.
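
A minimal Python sketch of this ordering, assuming the powerset lattice (so
that $\sqsubseteq$ is set inclusion) and interpretations encoded as nested
dictionaries, might look as follows. The encoding, the function names and the
two sample interpretations are hypothetical.

```python
def rel_leq(r1, r2):
    """Pointwise ordering of two relations over the powerset lattice;
    argument tuples that are absent are mapped to the empty set."""
    keys = set(r1) | set(r2)
    return all(r1.get(a, frozenset()) <= r2.get(a, frozenset()) for a in keys)

def rel_eq(r1, r2):
    return rel_leq(r1, r2) and rel_leq(r2, r1)

def lex_leq(g1, g2, rank, s):
    """g1 is below g2 iff conditions (a), (b) and (c) hold for some j in 1..s."""
    for j in range(1, s + 1):
        below = [R for R in rank if rank[R] < j]
        at_j = [R for R in rank if rank[R] == j]
        a = all(rel_eq(g1.get(R, {}), g2.get(R, {})) for R in below)
        b = all(rel_leq(g1.get(R, {}), g2.get(R, {})) for R in at_j)
        # given (b), a strict inequality is the same as not being equal
        c = j == s or any(not rel_eq(g1.get(R, {}), g2.get(R, {})) for R in at_j)
        if a and b and c:
            return True
    return False

rank = {"E": 1, "N": 2}
g1 = {"E": {("a",): frozenset({"a"})}, "N": {("a",): frozenset({"b", "c"})}}
g2 = {"E": {("a",): frozenset({"a", "b"})}, "N": {}}
```

Here `g1` is below `g2` although its interpretation of N is larger: the
strictly smaller interpretation of E at the first stratum already decides the
comparison, just as the first differing character decides a comparison of
strings.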

Lemma 1. $\preceq$ defines a partial order.

Proof. See Appendix A.

Assume cls has the form $cl_1, \ldots, cl_s$ where $cl_j$ is the clause corresponding
to stratum j, and let $\mathcal{R}_j$ denote the set of all relation symbols R defined in
$cl_1, \ldots, cl_j$ taking $\mathcal{R}_0 = \emptyset$. Let $M \subseteq \Delta$ denote a set of assignments which map
relation symbols to relations.

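Lemma 2. $\Delta = (\Delta, \preceq)$ is a complete lattice with the greatest lower bound given
by

$$\left(\bigsqcap_\Delta M\right)(R) = \lambda \vec{a}. \bigsqcap \{\varrho(R)(\vec{a}) \mid \varrho \in M_{rank(R)}\}$$

where

$$M_j = \left\{\varrho \in M \mid \forall R' : rank(R') < j \Rightarrow \varrho(R') = \left(\bigsqcap_\Delta M\right)(R')\right\}$$

Proof. See Appendix B.

Note that $\bigsqcap_\Delta M$ is well defined by induction on j observing that $M_0 = M$
and $M_j \subseteq M_{j-1}$.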

Proposition 1. Assume cls is a stratified LLFP clause sequence, $\varsigma_0$ and $\zeta_0$ are
interpretations of free variables and function symbols in cls, respectively.
Furthermore, $\varrho_0$ is an interpretation of all relations of rank 0. Then
$\{\varrho \mid (\varrho, \zeta_0, \varsigma_0) \models_\beta cls \wedge \forall R : rank(R) = 0 \Rightarrow \varrho_0(R) \sqsubseteq \varrho(R)\}$ is a Moore family.

Proof. See Appendix C.

The result ensures that the approach falls within the framework of Abstract
Interpretation [7,8]; hence we can be sure that there always is a single best
solution for the analysis problem under consideration, namely the one defined
in Proposition 1.

5 The Algorithm 

In this section we present the algorithm for solving LLFP clause sequences,
which extends the differential worklist algorithm by Nielson et al. [18,17]. The
algorithm computes the relations in increasing order of their rank and
therefore the negations present no obstacles. It completely abandons worklist-like
data structures, which are typical for most classical iterative fixpoint algorithms
[12]. Instead, we adapt the recursive top-down approach of Le Charlier and Van
Hentenryck [5], which is enhanced by continuation based semi-naive iteration.



In the following we assume that prior to solving the LLFP formula, all the
clauses are transformed into a form such that all applied occurrences of variables
$Y \in \mathcal{Y}$ in preconditions, i.e. $Y(u)$, are not followed by their defining occurrences,
i.e. $R(\vec{u}; Y)$ and $\neg R(\vec{u}; Y)$. This is necessary to correctly perform late bindings
of variables $Y \in \mathcal{Y}$ in the presence of the $Y(u)$ construct.

The algorithm operates with (intermediate) representations of the two
interpretations $\varsigma$ and $\varrho$ of the semantics; we shall call them env and result,
respectively, in the following. The data structure env is supplied as a parameter
to the functions of the algorithm, and it represents a partial environment. The
data structure result is an imperative data structure that is updated as we
progress.

The partial environment env is implemented as a map from variables to their
optional values. In the case the variable is undefined it is mapped to None.
Otherwise, depending on its type it is mapped to Some(a) or Some(l), which
means that the variable is bound to $a \in \mathcal{U}$ or $l \in \mathcal{L}$, respectively. The main
operation on env is the function UNIFY, defined as follows

$$\mathrm{UNIFY}(\beta, env, (u; V), (a; l)) =
\begin{cases}
fail & \text{if } \mathrm{UNIFY}_{\mathcal{U}}(env, u, a) = fail \\
\mathrm{UNIFY}_{\mathcal{L}}(\beta, env', V, l) & \text{if } \mathrm{UNIFY}_{\mathcal{U}}(env, u, a) = env'
\end{cases}$$

It uses two auxiliary functions that perform unifications on each component of
the relation. For the first component, which ranges over the universe $\mathcal{U}$, the
function $\mathrm{UNIFY}_{\mathcal{U}}$ performs a unification of an argument u with an element
$a \in \mathcal{U}$ in the environment env. In the case when the unification succeeds the
modified environment is returned, otherwise the function fails. The function is
extended to k-tuples in a straightforward way. The definition of the unification
function for the lattice component is given by

$$\mathrm{UNIFY}_{\mathcal{L}}(\beta, env, V, l) =
\begin{cases}
\{env[V \mapsto Some(l \sqcap l_V)]\} & \text{if } V \in \mathcal{Y} \wedge env[V] = Some(l_V) \wedge l \sqcap l_V \neq \bot \\
\{env[V \mapsto Some(l)]\} & \text{if } V \in \mathcal{Y} \wedge env[V] = None \wedge l \neq \bot \\
\{env\} & \text{if } V = [u] \wedge ((u \in \mathcal{X} \wedge env[u] = Some(a)) \vee u = a) \wedge \beta(a) \sqsubseteq l \\
\{env[u \mapsto Some(a)] \mid \beta(a) \sqsubseteq l\} & \text{if } V = [u] \wedge u \in \mathcal{X} \wedge env[u] = None \\
\emptyset & \text{otherwise}
\end{cases}$$

The function is parametrized with $\beta : \mathcal{U} \to \mathcal{L}$, defined in Section 3. It performs
a unification of a lattice term V with an element $l \in \mathcal{L}$ in the environment
env. In the case when the unification succeeds the set of unified environments is
returned, otherwise the function returns the empty set.
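
The behaviour of the unification functions can be sketched in Python for the
powerset lattice with $\beta(a) = \{a\}$. The term encoding (tagged pairs such
as `("var", "x")` or `("beta", u)` for $[u]$), the use of plain dictionaries
for environments with missing keys standing for None, and the choice to signal
failure by an empty list are all assumptions of this example; in particular
`unify_u` is only one plausible reading of the textual description of the
universe component.

```python
BOT = frozenset()

def beta(a):
    return frozenset({a})

def unify_u(env, u, a):
    """Unify a term u (variable or constant) with the element a.
    Returns the possibly extended environment, or None on failure."""
    kind, name = u
    if kind == "const":
        return env if name == a else None
    if env.get(name) is None:
        return {**env, name: a}
    return env if env[name] == a else None

def unify_l(env, V, l, universe):
    """Unify a lattice term V with the lattice element l.
    Returns the list of unified environments (empty on failure)."""
    kind, arg = V
    if kind == "lvar":                       # V is a variable Y
        bound = env.get(arg)
        glb = l if bound is None else (l & bound)
        return [{**env, arg: glb}] if glb != BOT else []
    ukind, uname = arg                       # V = [u]
    if ukind == "const" or env.get(uname) is not None:
        a = uname if ukind == "const" else env[uname]
        return [env] if beta(a) <= l else []
    return [{**env, uname: a} for a in sorted(universe) if beta(a) <= l]

def unify(env, term, tup, universe):
    (u, V), (a, l) = term, tup
    env1 = unify_u(env, u, a)
    return [] if env1 is None else unify_l(env1, V, l, universe)

UNIVERSE = {"a", "b", "c"}
TUPLE = ("a", frozenset({"b", "c"}))
```

For example, unifying `(x; Y)` with `TUPLE` in the empty environment binds x
to a and Y to {b, c}, while unifying `(a; [z])` yields one environment per
element of {b, c}.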

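The other important operation on the partial environment is given by the
function UNIFIABLE. The function, when applied to env and a tuple $(u; V)$,
returns a set of tuples for which UNIFY would succeed. The function is defined by
means of two auxiliary functions; formally we have

$$\mathrm{UNIFIABLE}(env, (u; V)) = (\mathrm{UNIFIABLE}_{\mathcal{U}}(env, u); \mathrm{UNIFIABLE}_{\mathcal{L}}(env, V))$$

where

$$\mathrm{UNIFIABLE}_{\mathcal{U}}(env, u) =
\begin{cases}
\{a\} & \text{if } (u \in \mathcal{X} \wedge env[u] = Some(a)) \vee u = a \\
\mathcal{U} & \text{if } u \in \mathcal{X} \wedge env[u] = None
\end{cases}$$

and

$$\mathrm{UNIFIABLE}_{\mathcal{L}}(env, V) =
\begin{cases}
l & \text{if } V \in \mathcal{Y} \wedge env[V] = Some(l) \\
\top & \text{if } V \in \mathcal{Y} \wedge env[V] = None \\
\beta(a) & \text{if } V = [u] \wedge (u = a \vee (u \in \mathcal{X} \wedge env[u] = Some(a))) \\
\bigsqcup \{\beta(a) \mid a \in \mathcal{U}\} & \text{if } V = [u] \wedge u \in \mathcal{X} \wedge env[u] = None \\
[\![f]\!](\vec{l}) & \text{if } V = f(\vec{V'}) \wedge \vec{l} = \mathrm{UNIFIABLE}_{\mathcal{L}}(env, \vec{V'})
\end{cases}$$

Both auxiliary functions are extended to k-tuples in a straightforward way.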

The global data structure result, which is updated incrementally during
computations, is represented as a mapping from predicate names to the prefix
trees that for each predicate R record the tuples currently known to belong to
R. There are three main operations on the data structure result: the operation
result.HAS checks whether a given tuple is associated with a given predicate,
the operation result.SUB returns a list of the tuples associated with a given
predicate and the operation result.ADD adds a tuple to the interpretation of a
given predicate.
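
A prefix tree supporting these three operations can be sketched in Python as
follows. The class and method names mirror the operations named in the text
but are otherwise hypothetical, and the optional `prefix` argument of `sub` is
an addition made for this example to show why a prefix tree is a natural fit.

```python
class PrefixTree:
    """Trie keyed on the components of a tuple."""

    def __init__(self):
        self.children = {}
        self.terminal = False

    def add(self, tup):
        """Insert tup; return True if it was not present before."""
        node = self
        for c in tup:
            node = node.children.setdefault(c, PrefixTree())
        added = not node.terminal
        node.terminal = True
        return added

    def has(self, tup):
        node = self
        for c in tup:
            node = node.children.get(c)
            if node is None:
                return False
        return node.terminal

    def sub(self, prefix=()):
        """All stored tuples that start with the given prefix."""
        node = self
        for c in prefix:
            node = node.children.get(c)
            if node is None:
                return []
        out, stack = [], [(node, tuple(prefix))]
        while stack:
            n, path = stack.pop()
            if n.terminal:
                out.append(path)
            stack.extend((child, path + (c,)) for c, child in n.children.items())
        return out


class Result:
    """Mapping from predicate names to prefix trees."""

    def __init__(self):
        self.trees = {}

    def add(self, R, tup):
        return self.trees.setdefault(R, PrefixTree()).add(tup)

    def has(self, R, tup):
        return R in self.trees and self.trees[R].has(tup)

    def sub(self, R, prefix=()):
        return self.trees[R].sub(prefix) if R in self.trees else []
```

Tuples sharing their first components share a path in the tree, so looking up
all tuples that agree with the already bound arguments of a query does not
require scanning the whole relation.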

Since $\varrho$ is updated as the algorithm progresses, it may happen that a query
$R(\vec{u}; V)$ inside a precondition fails to be satisfied at the given point in time, but
may hold in the future when a new tuple $(\vec{a}; l)$ is added to the interpretation of
R. If we are not careful we may lose the consequences that adding $(\vec{a}; l)$ to R will
have on the contents of other predicates. This gives rise to the data structure
infl that records computations that have to be resumed for the new tuples;
these future computations are called consumers. The infl data structure is also
represented as a mapping from the predicate names to prefix trees that for each
predicate R record consumers that have to be resumed when the interpretation
of R is updated. There are two main operations on the data structure infl:
the operation infl.REGISTER that adds a new consumer for a given predicate
and infl.CONSUMERS that returns all the consumers currently associated with
a given predicate.
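
The interplay between result and infl can be illustrated with a deliberately
simplified Python sketch. It ignores the lattice component and environments,
stores plain tuples in sets rather than prefix trees, and hand-codes two
Datalog-style clauses computing reachability; the predicates `Edge` and `Path`
and all function names are hypothetical.

```python
from collections import defaultdict

result = defaultdict(set)    # predicate -> tuples known so far
infl = defaultdict(list)     # predicate -> registered consumers

def add(R, tup):
    """Assert tup in R and resume every consumer waiting on R."""
    if tup in result[R]:
        return
    result[R].add(tup)
    for consumer in list(infl[R]):
        consumer(tup)

def query(R, consumer):
    """Positive query: register the consumer first, then replay the
    tuples already known, so that no tuple is missed."""
    infl[R].append(consumer)
    for tup in list(result[R]):
        consumer(tup)

# Clause  forall x, y : Edge(x, y) => Path(x, y)
query("Edge", lambda t: add("Path", t))

# Clause  forall x, y, z : Path(x, y) and Edge(y, z) => Path(x, z)
def on_path(p):
    query("Edge",
          lambda e: add("Path", (p[0], e[1])) if e[0] == p[1] else None)

query("Path", on_path)

# Facts arrive after the clauses have been executed; the registered
# consumers make sure their consequences are still derived.
add("Edge", ("a", "b"))
add("Edge", ("b", "c"))
add("Edge", ("c", "d"))
```

Although every `Edge` fact is added after the queries ran, `result["Path"]`
ends up containing the full transitive closure, which is exactly the situation
the consumers are meant to handle.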

In the algorithm, we have one function for each of the three syntactic
categories. The function SOLVE takes a clause sequence as input and calls the function
EXECUTE on each of the individual clauses

    SOLVE(cl_1, ..., cl_s) = EXECUTE(cl_1)[ ]; ...; EXECUTE(cl_s)[ ]

where we write [ ] for the empty environment reflecting that we have no free
variables in the clause sequences.

Let us now turn to the description of the function EXECUTE. The function
takes a clause cl as a parameter and a representation env of the interpretation
of the variables. We have one case for each of the forms of cl; the pseudo code
is given in Figure 1.

    EXECUTE(R(u; V')) env =
      let iterFun (a; l) =
        match result.HAS(R, (a; l)) with
        | true → ()
        | false →
            result.ADD(R, (a; l));
            iter (fun f → f (a; l)) (infl.CONSUMERS R)
      in iter iterFun (UNIFIABLE(env, (u; V')))
    EXECUTE(1) env = ()
    EXECUTE(cl1 ∧ cl2) env = EXECUTE(cl1) env; EXECUTE(cl2) env
    EXECUTE(pre ⇒ cl) env = CHECK(pre, EXECUTE(cl)) env
    EXECUTE(∀x : cl) env = EXECUTE(cl)(env[x ↦ None])

Fig. 1. The execute function.

Let us explain the case of an assertion first. The algorithm
uses the auxiliary function iter, which applies the function iterFun to each
element of the list of tuples that can be unified with the argument $(\vec{u}; V')$. Given
a tuple $(\vec{a}; l)$, the function iterFun adds the tuple to the interpretation of R
stored in result if it is not already present. If the add operation succeeds, we
first create a list of all the consumers currently registered for predicate R by
calling the function infl.CONSUMERS. Thereafter, we resume the computations
by iterating over the list of consumers and calling the corresponding continuations.
The case of the always true clause, 1, is straightforward; the function simply returns
the unit, without performing any other actions. In the case of the conjunction
of clauses the algorithm calls the EXECUTE function for both conjuncts and
the current environment env. In the case of implication we make use of the
function CHECK that in addition to the precondition and the environment also
takes the continuation EXECUTE(cl) as an argument. In the case of universal
quantification, we simply extend the environment to record that the value of the
new variable is unknown and then we recurse. The case of universal quantification
over a variable $Y \in \mathcal{Y}$ is exactly the same and hence omitted.

Now, let us present the function check. It takes a precondition , a contin- 
uation and an environment as parameters. The pseudo code is given in Figure 
[2] In the case of positive queries we first ensure that the consumer is registered 

CHECK(i?(v; V"), nea;t)env = 
let CONSUMER (a; I) = 

match UNlFY(env, (u; V), (a; I)) with 
| fail -> () 

| envs —¥ iter next envs 
in inf 1 .REGISTER^, CONSUMER); ITER CONSUMER (result. SUB R) 
CHECk(-i7?(d; V), next)env = 
let iterFun (a; I) = 

match result . HAS(i?, (a; I)) with 
| true —> () 

| false — ► iter next (uNlFY(env, (v; V), (a; I))) 
in iter iterFun (unifi able (env, (v; V))) 
CHECk(Y(x), next) env = 

let env' = if env(Y) = Some(l) then env else env[F M> T] 
in let F a = if Some(/3(a)) IZ env'(F) then next env' [a; h-> a] else () 
in match env ' (x) with 
| Some(a) — > F a 
| None — > ITER F U 
CHECK(prei A pre2, next)env = CHECK(prei, CHECK(pre2, next))eziv 
CHECK(prei V pre2, next)env = CHECK(prei, next)env; CHECK(pre2, next)env 
CHECK(3a; : pre,next)env = CHECK(pre, next o (remove x))(env[x M> None]) 

Fig. 2. The check function. 

in infl, by calling the function REGISTER, so that future tuples associated with R
will be processed. Thereafter, the function inspects the data structure result
to obtain the list of tuples associated with the predicate R. Then, the auxiliary
function CONSUMER unifies (v; V) with each tuple; if the operation succeeds,
the continuation next is invoked on each of the updated environments in
the returned set envs. In the case of a negated query, the algorithm first computes
the tuples unifiable with (v; V) in the environment env. Then, for each tuple, it
checks whether the tuple is already in R and, if not, the tuple is unified with
(v; V) to produce a set of new environments. Thereafter, the continuation next
is evaluated in each of the environments contained in the returned set. Notice
that in the case of negative queries we do not register a consumer for the
relation R. This is because the stratification condition introduced in Definition 3
ensures that the relation is fully evaluated before it is queried negatively. Thus,
there is no need to register future computations, since the interpretation of R
will not change. Now, let us consider the function CHECK in the case of Y(x), where
$x \in \mathcal{X}$. The function begins by creating an environment env' that is exactly
as env except that the binding for the variable Y is set to $\top$ in the case that Y is
undefined in env. Then, we define an auxiliary function F that checks whether
env'(Y) over-approximates the abstraction of an argument a, denoted by $\beta(a)$,
and if so the continuation is called in the environment $env'[x \mapsto a]$. Finally, the
function checks the binding for the variable x in the environment env' and, if it
is bound to Some(a), the function F is applied to a. Otherwise, the function
F is called for each element of the universe, using the ITER function. The
case of Y(a), where $a \in \mathcal{U}$, is essentially the same as the case explained above,
except that we do not have to handle the case when $x \in \mathcal{X}$ is undefined in env.
For a conjunction of preconditions we exploit a continuation-passing programming
style. More precisely, we call the CHECK function for the precondition $pre_1$, and
as a continuation we pass a call to the CHECK function partially applied to the
precondition $pre_2$ and the continuation next. In the case of a disjunction of
preconditions, the function simply checks the preconditions $pre_1$ and $pre_2$, in that
order, in the current environment env. In order to be efficient we use memoization; this
means that if both checks yield the same bindings of variables, the second check
does not need to invoke the continuation, as this has already been done. The
algorithm for existential quantification checks the precondition pre in the
environment extended with the quantified variable. The continuation that is passed
is a composition of the functions next and REMOVE x, where the function REMOVE
removes the variable passed as its first argument from the environment passed as its
second argument. In order to be efficient we again use memoization to avoid
redundant computations. The case of existential quantification over a variable
$Y \in \mathcal{Y}$ is exactly the same and hence omitted.
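
To make the continuation-passing structure concrete, the following Python sketch mirrors the positive-query, conjunction, disjunction and existential cases of CHECK. It is a simplified illustration under several assumptions: relations are plain sets of tuples without a lattice component, negated queries, the Y(x) case and memoization are left out, and the names `Solver`, `add` and `unify` as well as the tuple encoding of preconditions are hypothetical.

```python
from collections import defaultdict


class Solver:
    """Toy store: `result` maps a predicate to its known tuples and
    `infl` maps it to the consumers waiting for new tuples."""

    def __init__(self):
        self.result = defaultdict(set)
        self.infl = defaultdict(list)

    def add(self, pred, tup):
        # Record a new tuple and resume every consumer registered so far.
        if tup not in self.result[pred]:
            self.result[pred].add(tup)
            for consumer in list(self.infl[pred]):
                consumer(tup)


def unify(env, args, tup):
    """Extend env so that args matches tup; return None on a clash.
    Arguments that are keys of env are variables, all others are constants."""
    new = dict(env)
    for arg, val in zip(args, tup):
        if arg in new:
            if new[arg] is None:
                new[arg] = val
            elif new[arg] != val:
                return None
        elif arg != val:
            return None
    return new


def check(solver, pre, next_, env):
    kind = pre[0]
    if kind == "query":  # ("query", R, args)
        _, pred, args = pre

        def consumer(tup):
            env2 = unify(env, args, tup)
            if env2 is not None:
                next_(env2)

        # Register first so future tuples are processed, then replay known ones.
        solver.infl[pred].append(consumer)
        for tup in list(solver.result[pred]):
            consumer(tup)
    elif kind == "and":
        # The continuation of pre1 is "check pre2, then continue with next".
        _, p1, p2 = pre
        check(solver, p1, lambda e: check(solver, p2, next_, e), env)
    elif kind == "or":
        _, p1, p2 = pre
        check(solver, p1, next_, env)
        check(solver, p2, next_, env)
    elif kind == "exists":  # ("exists", x, body)
        _, x, body = pre

        def drop(e):
            # next composed with "remove x".
            next_({k: v for k, v in e.items() if k != x})

        check(solver, body, drop, {**env, x: None})
    else:
        raise ValueError(f"unknown precondition: {kind}")


# Transitive closure: edge(x,y) => path(x,y); edge(x,y) and path(y,z) => path(x,z).
s = Solver()
env0 = {"x": None, "y": None, "z": None}
check(s, ("query", "edge", ("x", "y")),
      lambda e: s.add("path", (e["x"], e["y"])), env0)
check(s, ("and", ("query", "edge", ("x", "y")), ("query", "path", ("y", "z"))),
      lambda e: s.add("path", (e["x"], e["z"])), env0)
for edge in [(1, 2), (2, 3), (3, 4)]:
    s.add("edge", edge)
```

Because consumers stay registered, tuples added after a clause has been checked still trigger its continuation, so the order in which facts arrive does not matter. Without memoization, a disjunction whose two branches yield the same bindings calls the continuation twice, which is the redundancy the memoization described above avoids.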



6 Conclusions and Future Work 

In this paper we introduced the LLFP logic, an expressive formalism for
specifying static analysis problems. It lifts the limitation of logics such as Datalog
and ALFP by allowing interpretations over complete lattices satisfying the Ascending
Chain Condition. Thanks to the declarative style, the analysis specifications are
easy to analyse for correctness.
We established a Moore Family result that guarantees that there always is
a unique best solution for LLFP formulae. More generally, this ensures that
the approach taken falls within the general Abstract Interpretation framework.
We also developed a state-of-the-art solving algorithm for LLFP: a
continuation-passing-style algorithm that represents relations as prefix trees. We
showed that the logic and the associated solver can be used for rapid prototyping
of sophisticated static analyses by presenting the formulation of interval
analysis.

As future work we plan to implement a front-end that automatically extracts
analysis relations from program source code, and to perform experiments on
real-world programs in order to evaluate the performance of the LLFP solver.
Furthermore, we would like to lift the Ascending Chain Condition and use, e.g.,
a widening operator [9,10] in order to ensure termination of the least fixed point
computation.

These appendices are not intended for publication and references to them
will be removed in the final version.

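A Proof of Lemma 1

Proof. Reflexivity: $\forall \varrho \in \Delta : \varrho \preceq \varrho$.

To show that $\varrho \preceq \varrho$ let us take $j = s$. If $\mathrm{rank}(R) < j$ then $\varrho(R) = \varrho(R)$ as
required. Otherwise, if $\mathrm{rank}(R) = j$, then from $\varrho(R) = \varrho(R)$ we get $\varrho(R) \sqsubseteq \varrho(R)$.
Thus we get the required $\varrho \preceq \varrho$.

Transitivity: $\forall \varrho_1, \varrho_2, \varrho_3 \in \Delta : \varrho_1 \preceq \varrho_2 \wedge \varrho_2 \preceq \varrho_3 \Rightarrow \varrho_1 \preceq \varrho_3$.
Let us assume that $\varrho_1 \preceq \varrho_2 \wedge \varrho_2 \preceq \varrho_3$. From $\varrho_i \preceq \varrho_{i+1}$ we have $j_i$ such that
conditions (a)-(c) are fulfilled for $i = 1, 2$. Let us take $j$ to be the minimum of $j_1$
and $j_2$. Now we need to verify that conditions (a)-(c) hold for $j$. If $\mathrm{rank}(R) < j$
we have $\varrho_1(R) = \varrho_2(R)$ and $\varrho_2(R) = \varrho_3(R)$. It follows that $\varrho_1(R) = \varrho_3(R)$, hence
(a) holds. Now let us assume that $\mathrm{rank}(R) = j$. We have $\varrho_1(R) \sqsubseteq \varrho_2(R)$ and
$\varrho_2(R) \sqsubseteq \varrho_3(R)$, and from transitivity of $\sqsubseteq$ we get $\varrho_1(R) \sqsubseteq \varrho_3(R)$, which gives
(b). Let us now assume that $j \neq s$, hence $\varrho_i(R) \sqsubset \varrho_{i+1}(R)$ for some $R \in \mathcal{R}$ and
some $i \in \{1, 2\}$. Without loss of generality let us assume that $\varrho_1(R) \sqsubset \varrho_2(R)$. We have
$\varrho_1(R) \sqsubset \varrho_2(R)$ and $\varrho_2(R) \sqsubseteq \varrho_3(R)$, hence $\varrho_1(R) \sqsubset \varrho_3(R)$, and (c) holds.

Anti-symmetry: $\forall \varrho_1, \varrho_2 \in \Delta : \varrho_1 \preceq \varrho_2 \wedge \varrho_2 \preceq \varrho_1 \Rightarrow \varrho_1 = \varrho_2$.
Let us assume $\varrho_1 \preceq \varrho_2$ and $\varrho_2 \preceq \varrho_1$. Let $j$ be minimal such that $\mathrm{rank}(R) = j$ and
$\varrho_1(R) \neq \varrho_2(R)$ for some $R \in \mathcal{R}$. Then, since $\mathrm{rank}(R) = j$, we have $\varrho_1(R) \sqsubseteq \varrho_2(R)$
and $\varrho_2(R) \sqsubseteq \varrho_1(R)$. Hence $\varrho_1(R) = \varrho_2(R)$, which is a contradiction. Thus it must
be the case that $\varrho_1(R) = \varrho_2(R)$ for all $R \in \mathcal{R}$. □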
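
The ordering used in this proof can be made executable on a toy instance. The sketch below is an illustration only: it assumes ranks range over 0..s, models each relation's interpretation as a `frozenset` ordered by inclusion (a powerset lattice rather than a general one), and the names `preceq`, `rank`, `r1` and `r2` are hypothetical. Conditions (a)-(c) are encoded as they are used in the proof.

```python
def preceq(r1, r2, rank, s, leq=lambda a, b: a <= b):
    """Return True if r1 precedes r2 in the lexicographic ordering, i.e. if
    some j in 0..s satisfies conditions (a), (b) and (c)."""
    for j in range(s + 1):
        lower = [R for R in rank if rank[R] < j]
        level = [R for R in rank if rank[R] == j]
        # (a) relations of rank below j agree
        a = all(r1[R] == r2[R] for R in lower)
        # (b) relations of rank j are ordered pointwise
        b = all(leq(r1[R], r2[R]) for R in level)
        # (c) either j = s, or some relation of rank j is strictly smaller
        c = j == s or any(leq(r1[R], r2[R]) and r1[R] != r2[R] for R in level)
        if a and b and c:
            return True
    return False


rank = {"A": 1, "B": 2}
r1 = {"A": frozenset({1}), "B": frozenset({1, 2, 3})}
r2 = {"A": frozenset({1, 2}), "B": frozenset()}

smaller = preceq(r1, r2, rank, 2)   # decided at rank 1, B is never compared
larger = preceq(r2, r1, rank, 2)
reflexive = preceq(r1, r1, rank, 2)
```

In this instance `r1` precedes `r2` although its rank-2 relation is larger: the comparison is settled at the first rank where the two interpretations differ, which is what distinguishes this ordering from the pointwise one.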

B Proof of Lemma 2

Proof. First we prove that $\bigsqcap_\Delta M$ is a lower bound of $M$; that is, $\bigsqcap_\Delta M \preceq \varrho$ for
all $\varrho \in M$. Let $j$ be maximum such that $\varrho \in M_j$; since $M = M_0$ and $M_j \supseteq M_{j+1}$,
clearly such $j$ exists. From the definition of $M_j$ it follows that $(\bigsqcap_\Delta M)(R) = \varrho(R)$ for
all $R$ with $\mathrm{rank}(R) < j$; hence (a) holds. If $\mathrm{rank}(R) = j$ we have $(\bigsqcap_\Delta M)(R) =
\lambda a. \bigsqcap \{\varrho'(R)(a) \mid \varrho' \in M_j\} \sqsubseteq \varrho(R)$, showing that (b) holds. Finally let us assume
that $j \neq s$; we need to show that there is some $R$ with $\mathrm{rank}(R) = j$ such that
$(\bigsqcap_\Delta M)(R) \sqsubset \varrho(R)$. Since we know that $j$ is maximum such that $\varrho \in M_j$, it
follows that $\varrho \notin M_{j+1}$, hence there is a relation $R$ with $\mathrm{rank}(R) = j$ such that
$(\bigsqcap_\Delta M)(R) \sqsubset \varrho(R)$; thus (c) holds.

Now we need to show that $\bigsqcap_\Delta M$ is the greatest lower bound. Let us assume that
$\varrho' \preceq \varrho$ for all $\varrho \in M$, and let us show that $\varrho' \preceq \bigsqcap_\Delta M$. If $\varrho' = \bigsqcap_\Delta M$ the result
holds trivially, hence let us assume $\varrho' \neq \bigsqcap_\Delta M$. Then there exists a minimal $j$
such that $(\bigsqcap_\Delta M)(R) \neq \varrho'(R)$ for some $R$ with $\mathrm{rank}(R) = j$. Let us first consider
$R$ such that $\mathrm{rank}(R) < j$. By our choice of $j$ we have $(\bigsqcap_\Delta M)(R) = \varrho'(R)$, hence
(a) holds. Next assume that $\mathrm{rank}(R) = j$. Since we assumed that $\varrho' \preceq \varrho$ for all
$\varrho \in M$ and $M_j \subseteq M$, it follows that $\varrho'(R) \sqsubseteq \varrho(R)$ for all $\varrho \in M_j$. Thus we have
$\varrho'(R) \sqsubseteq \lambda a. \bigsqcap \{\varrho(R)(a) \mid \varrho \in M_j\}$. Since $(\bigsqcap_\Delta M)(R) = \lambda a. \bigsqcap \{\varrho(R)(a) \mid \varrho \in
M_j\}$, we have $\varrho'(R) \sqsubseteq (\bigsqcap_\Delta M)(R)$, which proves (b). Finally, since we assumed
that $\varrho'(R) \neq (\bigsqcap_\Delta M)(R)$ for some $R$ with $\mathrm{rank}(R) = j$, it follows that (c) holds.
Thus we proved that $\varrho' \preceq \bigsqcap_\Delta M$. □
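
The rank-by-rank construction of the greatest lower bound that this proof relies on can be sketched as follows. The sketch assumes, as the proof suggests, that $M_{j+1}$ keeps those members of $M_j$ that agree with the bound on all relations of rank $j$. It also assumes ranks 0..s and a powerset lattice represented by `frozenset` with intersection as meet; the names `glb`, `meet` and `top` are hypothetical.

```python
from functools import reduce


def glb(M, rank, s, meet, top):
    """Greatest lower bound of the interpretations in M, built rank by rank."""
    result = {}
    Mj = list(M)  # M_0 = M
    for j in range(s + 1):
        level = [R for R in rank if rank[R] == j]
        for R in level:
            # Meet over the interpretations still in M_j (top if M_j is empty).
            result[R] = reduce(meet, (rho[R] for rho in Mj), top)
        # M_{j+1}: members of M_j that agree with the bound on rank j.
        Mj = [rho for rho in Mj if all(rho[R] == result[R] for R in level)]
    return result


top = frozenset({1, 2, 3})
meet = lambda a, b: a & b
rank = {"A": 1, "B": 2}

r1 = {"A": frozenset({1}), "B": frozenset({1, 2, 3})}
r2 = {"A": frozenset({1, 2}), "B": frozenset()}
bound = glb([r1, r2], rank, 2, meet, top)

r3 = {"A": frozenset({1}), "B": frozenset({1})}
r4 = {"A": frozenset({2}), "B": frozenset({2})}
bound2 = glb([r3, r4], rank, 2, meet, top)
```

In the first call the bound coincides with `r1`, since only `r1` survives into the next level. In the second call no interpretation agrees with the empty meet at rank 1, so the rank-2 relation is the meet of the empty set, that is, the top element.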

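C Proof of Proposition 1

In order to prove Proposition 1 we first state and prove two auxiliary lemmas.

Lemma 3. If $\varrho = \bigsqcap_\Delta M$, $pre$ occurs in $cl_j$ and $(\varrho, \varsigma) \models_\beta pre$ then also $(\varrho', \varsigma) \models_\beta
pre$ for all $\varrho' \in M_j$.

Proof. We proceed by induction on $j$ and in each case perform a structural
induction on the form of the precondition $pre$ occurring in $cl_j$.

Case: $pre = R(u; V)$

Let us take $\varrho = \bigsqcap_\Delta M$ and assume that

$$(\varrho, \varsigma) \models_\beta R(u; V)$$

From Table 1 we have:

$$\varrho(R)(\varsigma(u)) \sqsupseteq \varsigma(V)$$

Depending on the rank of $R$ we have two cases. If $\mathrm{rank}(R) = j$ then $\varrho(R) =
\lambda a. \bigsqcap \{\varrho'(R)(a) \mid \varrho' \in M_j\}$ and hence we have

$$\bigsqcap \{\varrho'(R)(\varsigma(u)) \mid \varrho' \in M_j\} \sqsupseteq \varsigma(V)$$

It follows that for all $\varrho' \in M_j$

$$\varrho'(R)(\varsigma(u)) \sqsupseteq \varsigma(V)$$

Now if $\mathrm{rank}(R) < j$ then $\varrho(R) = \varrho'(R)$ for all $\varrho' \in M_j$, hence we have that for
all $\varrho' \in M_j$

$$\varrho'(R)(\varsigma(u)) \sqsupseteq \varsigma(V)$$

which according to Table 1 is equivalent to

$$\forall \varrho' \in M_j : (\varrho', \varsigma) \models_\beta R(u; V)$$

which was required and finishes the case.
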
Case: $pre = Y(u)$

Let us take $\varrho = \bigsqcap_\Delta M$ and assume that

$$(\varrho, \varsigma) \models_\beta Y(u)$$

According to the semantics of LLFP in Table 1 we have

$$\beta(\varsigma(u)) \sqsubseteq \varsigma(Y)$$

It follows that

$$\forall \varrho' \in M_j : \beta(\varsigma(u)) \sqsubseteq \varsigma(Y)$$

which according to the semantics of LLFP in Table 1 is equivalent to

$$\forall \varrho' \in M_j : (\varrho', \varsigma) \models_\beta Y(u)$$

which was required and finishes the case.

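Case: $pre = \neg R(u; V)$

Let us take $\varrho = \bigsqcap_\Delta M$ and assume that

$$(\varrho, \varsigma) \models_\beta \neg R(u; V)$$

From Table 1 we have:

$$\complement(\varrho(R)(\varsigma(u))) \sqsupseteq \varsigma(V)$$

Since $\mathrm{rank}(R) < j$, we know that $\varrho(R) = \varrho'(R)$ for all $\varrho' \in M_j$, hence we
have that

$$\forall \varrho' \in M_j : \complement(\varrho'(R)(\varsigma(u))) \sqsupseteq \varsigma(V)$$

which according to Table 1 is equivalent to

$$\forall \varrho' \in M_j : (\varrho', \varsigma) \models_\beta \neg R(u; V)$$

which was required and finishes the case.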

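Case: $pre = pre_1 \wedge pre_2$

Let us take $\varrho = \bigsqcap_\Delta M$ and assume that

$$(\varrho, \varsigma) \models_\beta pre_1 \wedge pre_2$$

According to Table 1 we have

$$(\varrho, \varsigma) \models_\beta pre_1$$

and

$$(\varrho, \varsigma) \models_\beta pre_2$$

From the induction hypothesis we get that for all $\varrho' \in M_j$

$$(\varrho', \varsigma) \models_\beta pre_1$$

and

$$(\varrho', \varsigma) \models_\beta pre_2$$

It follows that for all $\varrho' \in M_j$

$$(\varrho', \varsigma) \models_\beta pre_1 \wedge pre_2$$

which was required and finishes the case.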

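Case: $pre = pre_1 \vee pre_2$

Let us take $\varrho = \bigsqcap_\Delta M$ and assume that

$$(\varrho, \varsigma) \models_\beta pre_1 \vee pre_2$$

According to Table 1 we have

$$(\varrho, \varsigma) \models_\beta pre_1$$

or

$$(\varrho, \varsigma) \models_\beta pre_2$$

From the induction hypothesis we get that for all $\varrho' \in M_j$

$$(\varrho', \varsigma) \models_\beta pre_1$$

or

$$(\varrho', \varsigma) \models_\beta pre_2$$

It follows that for all $\varrho' \in M_j$

$$(\varrho', \varsigma) \models_\beta pre_1 \vee pre_2$$

which was required and finishes the case.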

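Case: $pre = \exists x : pre'$

Let us take $\varrho = \bigsqcap_\Delta M$ and assume that

$$(\varrho, \varsigma) \models_\beta \exists x : pre'$$

According to Table 1 we have

$$\exists a \in \mathcal{U} : (\varrho, \varsigma[x \mapsto a]) \models_\beta pre'$$

From the induction hypothesis we get that for all $\varrho' \in M_j$

$$\exists a \in \mathcal{U} : (\varrho', \varsigma[x \mapsto a]) \models_\beta pre'$$

It follows from Table 1 that for all $\varrho' \in M_j$

$$(\varrho', \varsigma) \models_\beta \exists x : pre'$$

which was required and finishes the case.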

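Case: $pre = \exists Y : pre'$

Let us take $\varrho = \bigsqcap_\Delta M$ and assume that

$$(\varrho, \varsigma) \models_\beta \exists Y : pre'$$

According to Table 1 we have

$$\exists l \in \mathcal{L}_{\neq\bot} : (\varrho, \varsigma[Y \mapsto l]) \models_\beta pre'$$

From the induction hypothesis we get that for all $\varrho' \in M_j$

$$\exists l \in \mathcal{L}_{\neq\bot} : (\varrho', \varsigma[Y \mapsto l]) \models_\beta pre'$$

It follows from Table 1 that for all $\varrho' \in M_j$

$$(\varrho', \varsigma) \models_\beta \exists Y : pre'$$

which was required and finishes the case.

Lemma 4. If $\varrho = \bigsqcap_\Delta M$ and $(\varrho', \zeta, \varsigma) \models_\beta cl_j$ for all $\varrho' \in M$ then $(\varrho, \zeta, \varsigma) \models_\beta cl_j$.

Proof. We proceed by induction on $j$ and in each case perform a structural
induction on the form of the clause occurring in $cl_j$.
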
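Case: $cl_j = R(u; V)$

Assume that for all $\varrho' \in M$

$$(\varrho', \zeta, \varsigma) \models_\beta R(u; V)$$

From the semantics of LLFP we have that for all $\varrho' \in M$

$$\varrho'(R)([\![u]\!](\zeta, \varsigma)) \sqsupseteq [\![V]\!](\zeta, \varsigma)$$

It follows that:

$$\bigsqcap \{\varrho'(R)([\![u]\!](\zeta, \varsigma)) \mid \varrho' \in M\} \sqsupseteq [\![V]\!](\zeta, \varsigma)$$

Since $M_j \subseteq M$, we have:

$$\bigsqcap \{\varrho'(R)([\![u]\!](\zeta, \varsigma)) \mid \varrho' \in M_j\} \sqsupseteq [\![V]\!](\zeta, \varsigma)$$

We know that $\mathrm{rank}(R) = j$; hence $\varrho(R) = \lambda a. \bigsqcap \{\varrho'(R)(a) \mid \varrho' \in M_j\}$; thus

$$\varrho(R)([\![u]\!](\zeta, \varsigma)) = \bigsqcap \{\varrho'(R)([\![u]\!](\zeta, \varsigma)) \mid \varrho' \in M_j\} \sqsupseteq [\![V]\!](\zeta, \varsigma)$$

which according to Table 1 is equivalent to

$$(\varrho, \zeta, \varsigma) \models_\beta R(u; V)$$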

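Case: $cl_j = cl_1 \wedge cl_2$

Assume that for all $\varrho' \in M$:

$$(\varrho', \zeta, \varsigma) \models_\beta cl_1 \wedge cl_2$$

From Table 1 it is equivalent to

$$(\varrho', \zeta, \varsigma) \models_\beta cl_1 \text{ and } (\varrho', \zeta, \varsigma) \models_\beta cl_2$$

The induction hypothesis gives that

$$(\varrho, \zeta, \varsigma) \models_\beta cl_1 \text{ and } (\varrho, \zeta, \varsigma) \models_\beta cl_2$$

which according to Table 1 is equivalent to

$$(\varrho, \zeta, \varsigma) \models_\beta cl_1 \wedge cl_2$$

and finishes the case.
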
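Case: $cl_j = pre \Rightarrow cl$

Assume that for all $\varrho' \in M$:

$$(\varrho', \zeta, \varsigma) \models_\beta pre \Rightarrow cl \qquad (1)$$

We have two cases. In the first one $(\varrho, \varsigma) \models_\beta pre$ is false, hence $(\varrho, \zeta, \varsigma) \models_\beta
pre \Rightarrow cl$ holds trivially. In the second case let us assume:

$$(\varrho, \varsigma) \models_\beta pre \qquad (2)$$

Lemma 3 gives that for all $\varrho' \in M_j$

$$(\varrho', \varsigma) \models_\beta pre$$

From (1) we have that for all $\varrho' \in M_j$

$$(\varrho', \zeta, \varsigma) \models_\beta cl$$

and the induction hypothesis gives:

$$(\varrho, \zeta, \varsigma) \models_\beta cl$$

Hence from (2) we get:

$$(\varrho, \zeta, \varsigma) \models_\beta pre \Rightarrow cl$$

which was required and finishes the case.
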
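Case: $cl_j = \forall x : cl$

Assume that for all $\varrho' \in M$

$$(\varrho', \zeta, \varsigma) \models_\beta \forall x : cl$$

From Table 1 we have that for all $\varrho' \in M$ and for all $a \in \mathcal{U}$

$$(\varrho', \zeta, \varsigma[x \mapsto a]) \models_\beta cl$$

Thus from the induction hypothesis we get that for all $a \in \mathcal{U}$

$$(\varrho, \zeta, \varsigma[x \mapsto a]) \models_\beta cl$$

According to Table 1 it is equivalent to

$$(\varrho, \zeta, \varsigma) \models_\beta \forall x : cl$$

which was required and finishes the case.
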
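Case: $cl_j = \forall Y : cl$

Assume that for all $\varrho' \in M$

$$(\varrho', \zeta, \varsigma) \models_\beta \forall Y : cl$$

From Table 1 we have that

$$\forall \varrho' \in M,\ \forall l \in \mathcal{L}_{\neq\bot} : (\varrho', \zeta, \varsigma[Y \mapsto l]) \models_\beta cl$$

Thus from the induction hypothesis we get that

$$\forall l \in \mathcal{L}_{\neq\bot} : (\varrho, \zeta, \varsigma[Y \mapsto l]) \models_\beta cl$$

According to Table 1 it is equivalent to

$$(\varrho, \zeta, \varsigma) \models_\beta \forall Y : cl$$

which was required and finishes the case.

Proposition 1. Assume $cls$ is a stratified LLFP clause sequence, and $\varsigma_0$ and $\zeta_0$ are
interpretations of free variables and function symbols in $cls$, respectively.
Furthermore, $\varrho_0$ is an interpretation of all relations of rank 0. Then
$\{\varrho \mid (\varrho, \zeta_0, \varsigma_0) \models_\beta cls \wedge \forall R : \mathrm{rank}(R) = 0 \Rightarrow \varrho_0(R) \sqsubseteq \varrho(R)\}$ is a Moore family.

Proof. The result follows from Lemma 4. □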



